mean-field theory


Mean-field theory of graph neural networks in graph partitioning

Tatsuro Kawamoto, Masashi Tsubaki, Tomoyuki Obuchi

Neural Information Processing Systems

A theoretical performance analysis of the graph neural network (GNN) is presented. For classification tasks, the neural network approach has the advantage of flexibility in that it can be employed in a data-driven manner, whereas Bayesian inference requires the assumption of a specific model. A fundamental question is then whether the GNN achieves high accuracy in addition to this flexibility. Moreover, whether the achieved performance is predominantly a result of backpropagation or of the architecture itself is a matter of considerable interest. To gain better insight into these questions, a mean-field theory of a minimal GNN architecture is developed for the graph partitioning problem. This demonstrates a good agreement with numerical experiments.
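The abstract's "minimal GNN architecture" is not spelled out here; as a rough illustration of the kind of layer such an analysis targets, the sketch below runs an untrained, normalized message-passing step on a toy two-block graph. The layer form, the row normalization, and the toy graph are assumptions for illustration, not the paper's exact model:

```python
import numpy as np

def gnn_layer(A, X, W):
    """One message-passing step: aggregate neighbor features via the
    adjacency matrix A, mix channels with W, apply tanh, then
    row-normalize so feature magnitudes stay bounded across layers."""
    H = np.tanh(A @ X @ W)
    return H / (np.linalg.norm(H, axis=1, keepdims=True) + 1e-12)

# toy graph: two triangles (blocks) joined by a single edge
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 4))   # random initial node features
W = rng.standard_normal((4, 4))   # random (untrained) channel-mixing weights
for _ in range(10):
    X = gnn_layer(A, X, W)
```

Studying what such random, untrained layers do to node features is exactly the setting where a mean-field treatment of the architecture (as opposed to the training procedure) applies.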



Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory

Neural Information Processing Systems

Temporal-difference and Q-learning play a key role in deep reinforcement learning, where they are empowered by expressive nonlinear function approximators such as neural networks. At the core of their empirical successes is the learned feature representation, which embeds rich observations, e.g., images and texts, into the latent space that encodes semantic structures. Meanwhile, the evolution of such a feature representation is crucial to the convergence of temporal-difference and Q-learning. In particular, temporal-difference learning converges when the function approximator is linear in a feature representation, which is fixed throughout learning, and possibly diverges otherwise. We aim to answer the following questions: When the function approximator is a neural network, how does the associated feature representation evolve?
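The convergence claim for a linear approximator over a fixed feature representation can be made concrete in a few lines of TD(0). The two-state chain, step size, and one-hot features below are illustrative choices, not taken from the paper:

```python
import numpy as np

def td0(phi, rewards, next_phi, alpha=0.5, gamma=0.9, sweeps=200):
    """TD(0) with a linear approximator V(s) = phi(s) @ w over a FIXED feature map."""
    w = np.zeros(phi.shape[1])
    for _ in range(sweeps):
        for x, r, x_next in zip(phi, rewards, next_phi):
            delta = r + gamma * (x_next @ w) - x @ w   # TD error
            w += alpha * delta * x                     # semi-gradient update
    return w

# two-state chain: state 0 -> state 1 (reward 0); state 1 -> state 1 (reward 1)
phi      = np.eye(2)                 # one-hot features, one row per transition
rewards  = np.array([0.0, 1.0])
next_phi = np.stack([phi[1], phi[1]])
w = td0(phi, rewards, next_phi)
# Bellman fixed point: V(1) = 1/(1-0.9) = 10, V(0) = 0.9 * V(1) = 9
```

When the features themselves are the hidden layer of a neural network and evolve during training, this clean fixed-point picture no longer applies directly, which is the gap the paper's mean-field analysis addresses.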


Multi-Agent Reinforcement Learning and Real-Time Decision-Making in Robotic Soccer for Virtual Environments

Taourirte, Aya, Mia, Md Sohag

arXiv.org Artificial Intelligence

The deployment of multi-agent systems in dynamic, adversarial environments like robotic soccer necessitates real-time decision-making, sophisticated cooperation, and scalable algorithms to avoid the curse of dimensionality. While Reinforcement Learning (RL) offers a promising framework, existing methods often struggle with the multi-granularity of tasks (long-term strategy vs. instant actions) and the complexity of large-scale agent interactions. This paper presents a unified Multi-Agent Reinforcement Learning (MARL) framework that addresses these challenges. First, we establish a baseline using Proximal Policy Optimization (PPO) within a client-server architecture for real-time action scheduling, with PPO demonstrating superior performance (4.32 avg. goals, 82.9% ball control). Second, we introduce a Hierarchical RL (HRL) structure based on the options framework to decompose the problem into a high-level trajectory planning layer (modeled as a Semi-Markov Decision Process) and a low-level action execution layer, improving global strategy (avg. goals increased to 5.26). Finally, to ensure scalability, we integrate mean-field theory into the HRL framework, simplifying many-agent interactions into a single agent vs. the population average. Our mean-field actor-critic method achieves a significant performance boost (5.93 avg. goals, 89.1% ball control, 92.3% passing accuracy) and enhanced training stability. Extensive simulations of 4v4 matches in the Webots environment validate our approach, demonstrating its potential for robust, scalable, and cooperative behavior in complex multi-agent domains.
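The "single agent vs. the population average" reduction can be sketched for discrete actions: each agent scores its own candidate actions against the empirical mean action of its neighbors rather than against every neighbor individually. The bilinear toy Q-function below is a made-up stand-in, not the paper's learned critic:

```python
import numpy as np

def mean_action(neighbor_actions, n_actions):
    """Empirical distribution of the neighbors' discrete actions."""
    return np.bincount(neighbor_actions, minlength=n_actions) / len(neighbor_actions)

def greedy_mean_field_action(q_fn, state, neighbor_actions, n_actions):
    """Pick the action that scores best against the population average,
    collapsing O(N) pairwise interactions into a single mean-field term."""
    a_bar = mean_action(neighbor_actions, n_actions)
    scores = [q_fn(state, a, a_bar) for a in range(n_actions)]
    return int(np.argmax(scores))

# toy critic: a private preference per action plus a bonus for joining the crowd
pref = np.array([0.2, 0.0, 0.1])
q_fn = lambda state, a, a_bar: pref[a] + 0.5 * a_bar[a]
act = greedy_mean_field_action(q_fn, None, [1, 1, 2, 1], 3)
```

Because the joint action space is summarized by one average, the critic's input size stays constant as the number of agents grows, which is the scalability argument behind the mean-field actor-critic.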



Neural-Quantum-States Impurity Solver for Quantum Embedding Problems

Zhou, Yinzhanghao, Lee, Tsung-Han, Chen, Ao, Lanatà, Nicola, Guo, Hong

arXiv.org Artificial Intelligence

Correlated electron systems exhibit a variety of electronic phases, including metallic, insulating, and superconducting states [1], and represent a versatile design space for technological applications in electronics, quantum computing, and sensing. Designing new correlated materials with targeted properties therefore depends on the ability to solve the many-body electronic Hamiltonian, a computationally demanding task [2]. Quantum embedding (QE) methods provide a robust framework for overcoming these challenges [3-5]. The common strategy underlying these methods is to describe each fragment of interacting orbitals through an effective model, where its complex environment is replaced by a simpler, entangled quantum bath designed to approximate its influence [3, 6]. The link between this effective model and the original system is established through self-consistency conditions, and different embedding schemes are defined by their choice of which physical property to match [3]. Dynamical mean-field theory (DMFT) [7-14], for instance, uses frequency-dependent one-body Green's functions, whereas density matrix embedding theory (DMET) [5, 15-19] typically uses one- and two-body density matrices. The recently developed ghost Gutzwiller approximation (gGA) [20-26] is a powerful variational method that generalizes the standard Gutzwiller approximation (GA) [27-34] by systematically extending its variational space with auxiliary "ghost" fermionic degrees of freedom. This approach yields results in remarkable agreement with DMFT but at a much lower computational cost, as it requires calculating only the ground state of a finite-size impurity model, whereas DMFT requires the full spectra of an impurity model that corresponds to an infinite bath. Successful applications of gGA include accurate modelling of Anderson lattice systems [21], excitonic phenomena [20], non-equilibrium systems [35], and altermagnetic systems [36], along with extensions that achieve charge self-consistency with density functional theory (DFT) [22], demonstrating its versatility and practical utility in real-material contexts.
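All of the embedding schemes above share one loop structure: solve a small effective model, then update its bath until a chosen physical property matches. As a purely schematic illustration of that fixed-point structure (the scalar update rule here is a toy, not an impurity solver):

```python
def self_consistent(update, x0, tol=1e-12, max_iter=500):
    """Generic fixed-point loop: iterate x <- update(x) until it stops moving.
    In a quantum embedding scheme, 'update' would stand for 'solve the
    impurity model, then re-derive the bath from the matching condition'."""
    x = x0
    for _ in range(max_iter):
        x_new = update(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("self-consistency loop did not converge")

# toy scalar 'matching condition': the fixed point of x -> (x + 2/x)/2 is sqrt(2)
root = self_consistent(lambda x: 0.5 * (x + 2.0 / x), 1.0)
```

The cost of each pass through this loop is dominated by the inner solve, which is why replacing a full-spectrum solver (as in DMFT) with a ground-state-only one (as in gGA) pays off.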



Gaussian mixture layers for neural networks

Chewi, Sinho, Rigollet, Philippe, Yan, Yuling

arXiv.org Machine Learning

The mean-field theory for two-layer neural networks considers infinitely wide networks that are linearly parameterized by a probability measure over the parameter space. This nonparametric perspective has significantly advanced both the theoretical and conceptual understanding of neural networks, with substantial efforts made to validate its applicability to networks of moderate width. In this work, we explore the opposite direction, investigating whether dynamics can be directly implemented over probability measures. Specifically, we employ Gaussian mixture models as a flexible and expressive parametric family of distributions together with the theory of Wasserstein gradient flows to derive training dynamics for such measures. Our approach introduces a new type of layer -- the Gaussian mixture (GM) layer -- that can be integrated into neural network architectures. As a proof of concept, we validate our proposal through experiments on simple classification tasks, where a GM layer achieves test performance comparable to that of a two-layer fully connected network. Furthermore, we examine the behavior of these dynamics and demonstrate numerically that GM layers exhibit markedly different behavior compared to classical fully connected layers, even when the latter are large enough to be considered in the mean-field regime.
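To make the idea of a layer parameterized by a probability measure tangible, here is one hypothetical reading of a Gaussian-mixture layer: each unit is an isotropic Gaussian bump, and the layer outputs the component responsibilities of its input. This parameterization is an assumption for illustration, not the construction or the Wasserstein-gradient-flow training dynamics from the paper:

```python
import numpy as np

def gm_layer(X, means, log_sigmas):
    """Map each input to responsibilities over k isotropic Gaussian components."""
    d = X.shape[1]
    sig2 = np.exp(2.0 * log_sigmas)                        # (k,) component variances
    sq = ((X[:, None, :] - means[None]) ** 2).sum(-1)      # (n, k) squared distances
    logp = -0.5 * sq / sig2 - 0.5 * d * (np.log(2 * np.pi) + 2.0 * log_sigmas)
    logp -= logp.max(axis=1, keepdims=True)                # stabilized softmax
    p = np.exp(logp)
    return p / p.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 2))
means = np.array([[-2.0, 0.0], [2.0, 0.0], [0.0, 2.0]])    # k = 3 component centers
log_sigmas = np.zeros(3)                                   # unit standard deviations
R = gm_layer(X, means, log_sigmas)                         # (5, 3) responsibilities
```

Unlike a fully connected layer, whose units carve half-spaces, such units respond locally around their centers, which hints at why the two layer types can behave quite differently even at comparable width.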